Automated Classification of Overfitting Patches With Statically Extracted Code Features

Abstract

Automatic program repair (APR) aims to reduce the cost of manually fixing software defects. However, APR suffers from generating a multitude of overfitting patches, those patches that fail to correctly repair the defect beyond making the tests pass. This paper presents a novel overfitting patch detection system called ODS to assess the correctness of APR patches. ODS first statically compares a patched program and a buggy program in order to extract code features at the abstract syntax tree (AST) level. Then, ODS uses supervised learning with the captured code features and patch correctness labels to automatically learn a probabilistic model. The learned ODS model can then finally be applied to classify new and unseen program repair patches. We conduct a large-scale experiment to evaluate the effectiveness of ODS on patch correctness classification based on 10,302 patches from the Defects4J, Bugs.jar and Bears benchmarks. The empirical evaluation shows that ODS is able to correctly classify 71.9% of program repair patches from 26 projects, which improves the state-of-the-art. ODS is applicable in practice and can be employed as a post-processing procedure to classify the patches generated by different APR systems.
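The three steps in the abstract (static comparison at the AST level, supervised learning of a probabilistic model, classification of unseen patches) can be illustrated with a small Python sketch. This is not the ODS implementation. It makes several assumptions for illustration: it analyzes toy Python snippets with Python's standard `ast` module rather than Java patches, it uses a hand-picked vocabulary of four node types instead of the paper's feature set, and it swaps in a tiny hand-written logistic regression as the probabilistic model. All function names, the `safe_div` example, and the training labels are hypothetical.

```python
import ast
import math
from collections import Counter

# Hypothetical feature vocabulary: AST node types whose count change we track.
VOCABULARY = ["If", "Compare", "BinOp", "Return"]


def node_type_counts(source):
    """Count how often each AST node type occurs in a piece of Python source."""
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))


def patch_features(buggy, patched):
    """One feature per node type: count in patched code minus count in buggy code."""
    before, after = node_type_counts(buggy), node_type_counts(patched)
    return [float(after[name] - before[name]) for name in VOCABULARY]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(model, features):
    """Estimated probability that a patch is overfitting (label 1)."""
    weights, bias = model
    return sigmoid(bias + sum(w * f for w, f in zip(weights, features)))


def train_logistic(X, y, epochs=500, lr=0.1):
    """Stand-in probabilistic model: logistic regression fitted with plain SGD."""
    model = ([0.0] * len(X[0]), 0.0)
    for _ in range(epochs):
        for features, label in zip(X, y):
            weights, bias = model
            error = predict(model, features) - label
            weights = [w - lr * error * f for w, f in zip(weights, features)]
            model = (weights, bias - lr * error)
    return model


# Toy buggy program: crashes when b == 0.
BUGGY = "def safe_div(a, b):\n    return a / b\n"

# Hypothetical labelled patches: 0 = correct, 1 = overfitting.
TRAINING_PATCHES = [
    ("def safe_div(a, b):\n    if b == 0:\n        return 0\n    return a / b\n", 0),
    ("def safe_div(a, b):\n    if b == 0:\n        raise ValueError('b')\n    return a / b\n", 0),
    ("def safe_div(a, b):\n    return 0\n", 1),  # deletes the division entirely
    ("def safe_div(a, b):\n    return a\n", 1),  # deletes the division entirely
]

X = [patch_features(BUGGY, patched) for patched, _ in TRAINING_PATCHES]
y = [label for _, label in TRAINING_PATCHES]
model = train_logistic(X, y)

# Classify patches that were not part of the training set.
unseen_guard = "def safe_div(a, b):\n    if b == 0:\n        return None\n    return a / b\n"
unseen_delete = "def safe_div(a, b):\n    return 1\n"
p_guard = predict(model, patch_features(BUGGY, unseen_guard))
p_delete = predict(model, patch_features(BUGGY, unseen_delete))
print(f"guard patch:  P(overfitting) = {p_guard:.2f}")
print(f"delete patch: P(overfitting) = {p_delete:.2f}")
```

With only four training patches the learned model is merely a demonstration of the data flow: features come from a purely static diff of the two ASTs, and the classifier outputs a probability that can be thresholded in a post-processing step after patch generation.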

Similar resources

Sentiment Classification using Automatically Extracted Subgraph Features

In this work, we propose a novel representation of text based on patterns derived from linguistic annotation graphs. We use a subgraph mining algorithm to automatically derive features as frequent subgraphs from the annotation graph. This process generates a very large number of features, many of which are highly correlated. We propose a genetic programming based approach to feature constructio...

Fuzzy Classification of Physiographic Features Extracted from Multiscale DEMs

Geomorphological landforms are generally viewed as Boolean objects. However, recent studies have shown that landforms are more suitable to be viewed as fuzzy objects, whereby a landform is defined as a region in the continuum of variation of the surface of the earth. In this paper, the fuzzy classification of physiographic features extracted from multiscale DEMs is performed. First, the lifting...

Sentiment Classification Using Semantic Features Extracted from WordNet-based Resources

In this paper, we concentrate on three of the tracks proposed in NTCIR 8 MOAT, concerning the classification of sentences according to their opinionatedness, relevance and polarity. We propose a method for the detection of opinions, relevance, and polarity classification, based on ISR-WN (a resource for the multidimensional analysis with Relevant Semantic Trees of sentences using different ...

Exploring Discriminatory Features for Automated Malware Classification

The ever-growing malware threat in the cyber space calls for techniques that are more effective than widely deployed signature-based detection systems and more scalable than manual reverse engineering by forensic experts. To counter large volumes of malware variants, machine learning techniques have been applied recently for automated malware classification. Despite the successes made from thes...

Land Cover Classification of Landsat Data with Phenological Features Extracted from Time Series MODIS NDVI Data

Temporal-related features are important for improving land cover classification accuracy using remote sensing data. This study investigated the efficacy of phenological features extracted from time series MODIS Normalized Difference Vegetation Index (NDVI) data in improving the land cover classification accuracy of Landsat data. The MODIS NDVI data were first fused with Landsat data via the Spa...

Journal

Journal title: IEEE Transactions on Software Engineering

Year: 2022

ISSN: 0098-5589, 1939-3520, 2326-3881

DOI: https://doi.org/10.1109/tse.2021.3071750